[Plate 1.]

(1) In a memoir presented to the Royal Society in 1894, I dealt with skew variation
in homogeneous material. The object of that memoir was to obtain a series of curves
such that one or other of them would agree with any observational or theoretical
frequency curve of positive ordinates to the following extent: (i) the areas should
be equal; (ii) the mean abscissa or centroid vertical should be the same for the two
curves; (iii) the standard deviation (or, what amounts to the same thing, the second
moment coefficient) about this centroid vertical should be the same; and (iv) to (v)
the third and fourth moment coefficients should also be the same. If $\mu_s$ be the $s$th
moment coefficient about the mean vertical, $N$ the area, $\bar{x}$ the mean abscissa,
$\sigma = \sqrt{\mu_2}$ the standard deviation, $\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$, then the equality for the two
curves of $N$, $\bar{x}$, $\sigma$, $\beta_1$ and $\beta_2$ leads almost invariably in the case of frequency to
excellency of fit. Indeed, badness of fit generally arises from either heterogeneity,
or the difficulty in certain cases of accurately determining from the data provided the
true values of the moment coefficients, e.g., especially in J- and U-shaped frequency
distributions, or distributions without high contact at the terminals; here the usual
method of correcting the raw moments for sub-ranges of record fails.
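
The five constants named above can be computed directly from a list of observations. The following is only a minimal sketch: the function name `moment_ratios` is hypothetical, and the moments are taken as plain averages of powers of deviations, with no correction for grouping or sub-ranges of record.

```python
def moment_ratios(xs):
    """Return (N, mean, sigma, beta1, beta2) for a list of observations.

    Assumes raw (uncorrected) moments about the mean:
    beta1 = mu3^2 / mu2^3 and beta2 = mu4 / mu2^2.
    """
    n = len(xs)
    mean = sum(xs) / n
    # mu[k] is the k-th moment coefficient about the mean.
    mu = [sum((x - mean) ** k for x in xs) / n for k in range(5)]
    beta1 = mu[3] ** 2 / mu[2] ** 3
    beta2 = mu[4] / mu[2] ** 2
    return n, mean, mu[2] ** 0.5, beta1, beta2
```

For two equal lumps, e.g. `moment_ratios([-1.0, 1.0])`, the sketch gives $\beta_1 = 0$ and $\beta_2 = 1$.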

Having found a curve which corresponded to the skew binomial in the same manner 
as the normal curve of errors to the symmetrical binomial with finite index, it occurred 
to me that a development of the process applied to the hypergeometrical series would 
achieve the result I was in search of, i.e., a curve whose constants would be determined 
by the observational values of $N$, $\bar{x}$, $\sigma$, $\beta_1$ and $\beta_2$.

The hypergeometrical series was one not only arising naturally in chance problems, 
but covering in itself a most extensive range of functions. The direct advantage of 
the hypergeometrical series is that it abrogates the fundamental axioms on which the 
Gaussian frequency is based. The equality in frequency of plus and minus errors of 
the same magnitude is replaced by an arbitrary ratio, the number of contributory 
430 PROF. KARL PEARSON ON SKEW VARIATION.

causes is no longer indefinitely large, and the contributions of these causes are no 
longer independent but correlated.*

Since $\beta_1$ and $\beta_2$ are by nature positive we can represent all possible values of $\beta_1$, $\beta_2$ on a
chart in which $\beta_1$ and $\beta_2$ are the co-ordinates of a point in the positive quadrant. But a
little consideration shows that $\beta_2$ must be greater than $\beta_1$, thus one-half the area of the
quadrant, that above the line $\beta_2 - \beta_1 = 0$, is removed from the field of possible occurrences.
Further, there is a limit to the application of the series of curves discussed when $\beta_2$
gets large, for the high moments of two of the types of curves, i.e., Types IV. and VI.,
or 

\ aV 

become infinite when the order of the moment is greater than $r$, or the probable error
of the fourth moment would become indefinitely large for $r < 7$, i.e., we are practically
limited by the line $8\beta_2 - 15\beta_1 - 36 = 0$. The first four moments of the curve remain
finite, but from the fifth onwards they can become infinite, the lines corresponding to
these, however, lying outside the above line.† For curves corresponding to points
below this line it is fitting to take as differential equation

$$\frac{1}{y}\frac{dy}{dx} = \frac{b + x}{c_0 + c_1 x + c_2 x^2 + c_3 x^3}, \qquad \text{(i)}$$

or a slightly more general form which is related to the higher hypergeometrical
$F(\alpha, \beta, \gamma, \theta, \epsilon, 1)$ as the present series of curves to the simple hypergeometrical
$F(\alpha, \beta, \gamma, 1)$. The whole theory of curves of the above type has been worked out for
some time past, but has remained unpublished, for we failed to find any definitely
homogeneous data by which it could be effectively illustrated, and for this reason
heterotypic curves have for the time being been left in abeyance. We may, however,
notice the following point. If we take our generalised hypergeometrical to be

$$1 + \frac{\alpha\beta\gamma}{\theta\epsilon\zeta} + \frac{(\alpha+1)(\beta+1)(\gamma+1)\,\alpha\beta\gamma}{(\theta+1)(\epsilon+1)(\zeta+1)\,\theta\epsilon\zeta} + \ldots$$



Then 

$$\frac{y_{x+1}}{y_x} = \frac{(\alpha+x)(\beta+x)(\gamma+x)}{(\theta+x)(\epsilon+x)(\zeta+x)},$$

and this will correspond to the ordinary form if $\zeta = 0$, i.e., $F(\alpha, \beta, \gamma, \theta, \epsilon, 1)$.

* Just as values of the binomial $(p + q)^n$ with negative $n$ and $p > 1$ very often give good fits to frequency
distributions, so we have recently found that hypergeometricals $F(\alpha, \beta, \gamma, 1)$ with imaginary $\alpha$ and $\beta$ are
of fairly common occurrence in frequency distributions, and when applied to individual samples from real
hypergeometrical populations may give better fits than the theoretical series, i.e., in card drawings.

† See Rhind, 'Biometrika,' vol. VII., p. 133.
We have

$$\frac{y_{x+1} - y_x}{\tfrac12 (y_{x+1} + y_x)} = \frac{2\{\alpha\beta\gamma - \theta\epsilon\zeta + x(\alpha\beta + \beta\gamma + \gamma\alpha - \theta\epsilon - \epsilon\zeta - \zeta\theta) + x^2(\alpha + \beta + \gamma - \theta - \epsilon - \zeta)\}}{\alpha\beta\gamma + \theta\epsilon\zeta + x(\alpha\beta + \beta\gamma + \gamma\alpha + \theta\epsilon + \epsilon\zeta + \zeta\theta) + x^2(\alpha + \beta + \gamma + \theta + \epsilon + \zeta) + 2x^3},$$

and accordingly we get the curve approximating to the hypergeometrical of the higher
order by putting

$$\frac{1}{y}\frac{dy}{dx} = \frac{\text{quadratic function of } x}{\text{cubic function of } x}, \qquad \text{(ii)}$$

where the six independent constants can be expressed in terms of the original six,
$\alpha, \beta, \gamma, \theta, \epsilon, \zeta$. It will be seen that a hypergeometrical of the second order will, in
general, have two modes, the exception being when

$$\alpha + \beta + \gamma = \theta + \epsilon + \zeta; \qquad \text{(iii)}$$

in which case (ii) coincides with (i), the general equation to the fourth approximation
of curves when $\beta_1$ and $\beta_2$ fall into the heterotypic area. It will thus be noted that such
curves approximate to hypergeometric series of the second order when the special
condition (iii) holds; always assuming the unimodal character of homogeneous material.
It seems probable that for the most part bimodal frequencies would be those that lead
to values of $\beta_1$ and $\beta_2$ lying in the heterotypic region, and such are excluded from
practical statistics.

In the original paper* four types of curves were dealt with beside the Gaussian
curve corresponding to an isolated point. A supplementary memoir issued in 1901†
dealt with two further types, which had been overlooked until actual experience
demonstrated their existence. I have now to confess the omission of five further
types, not to speak of a horizontal straight line, as sub-groups of the J-section of
curves, which are themselves in practice so rare, that the region of the $\beta_1$, $\beta_2$ plane in
which they occur had not been very fully investigated. My attention was drawn
to these curves while considering the frequency curves for the correlation of small
samples. If we take a sample of four from uncorrelated material, the sample is equally
likely to have every correlation from $-1$ to $+1$.‡ In this case, $\beta_1 = 0$, $\beta_2 = 1.8$, and
the frequency curve is a horizontal straight line. What would my series of curves
give in this case? I discovered that they also gave a rectangle of frequency or a
horizontal straight line, and this discovery led me to a closer investigation of the
sub-groups of curves in the neighbourhood of the J-curve area. The point in the

* 'Phil. Trans.,' A, vol. 186 (1895), pp. 343-414.
† 'Phil. Trans.,' A, vol. 197 (1901), pp. 443-459.
‡ 'Biometrika,' vol. VI., p. 306, and vol. X., p. 312.
$\beta_1$, $\beta_2$ plane for which $\beta_1 = 0$, $\beta_2 = 1.8$, I term the rectangle-point and denote by R.
(See folding diagram, Plate 1, at end of paper.)

The rectangle-point is the point of contact with the axis of $\beta_2$ of the biquadratic

$$\beta_1(8\beta_2 - 9\beta_1 - 12)(\beta_2 + 3)^2 = (4\beta_2 - 3\beta_1)(10\beta_2 - 12\beta_1 - 18)^2,$$

which bounds the area of J-curves. The novel curves are in part limiting curves
which occur when the point $\beta_1$, $\beta_2$ lies on this biquadratic, i.e., transition curves from
J-curves to U-curves and from J-curves to limited range curves, and in part a
limiting curve which exists along the line $5\beta_2 - 6\beta_1 - 9 = 0$ which passes through the
rectangle-point and never again meets the biquadratic in the loop in the positive
quadrant. It would be convenient to speak of this line as the axis of the biquadratic
loop, but unfortunately the loop is not symmetrical about it, and to avoid
misunderstanding I term it the R-line.

Up to the present the minimum limit to the area of U-curves had not been given.
Since $\beta_2$ is $> \beta_1$, half the positive quadrant was impossible, but a recent observation
shows that frequency curves above the line $\beta_2 - \beta_1 - 1 = 0$ are impossible. This limit
was suggested in the following manner. When samples of three are taken from an
indefinite population, the frequency curves for the correlation of any two variates of
the three individuals sampled are U-shaped frequency curves, but when samples of
two are taken the correlation must be either positive or negative, and accordingly
the frequency is collected into two lumps or blocks as a limiting case of a U-shaped
distribution. But for two such lumps $\beta_2 - \beta_1 - 1 = 0$. In other words, along the line
$\beta_2 - \beta_1 - 1 = 0$, the U-shaped frequency either brings all frequency to an end, or
passes through a transitional case. The former is the true state of affairs, for $\beta_2$
cannot be less than $\beta_1 + 1$. To demonstrate this,* let $s_r = S(x^r)$, and let there be
$n$ quantities $x$. Clearly, $s_0 = n$, and $s_1 = 0$. Now by Burnside and Panton,
'Theory of Equations,' vol. II., p. 35,



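$$\sum_{\substack{r,s,t = 1 \\ r > s > t}}^{n} (x_s - x_t)^2 (x_t - x_r)^2 (x_r - x_s)^2 = \begin{vmatrix} s_0 & s_1 & s_2 \\ s_1 & s_2 & s_3 \\ s_2 & s_3 & s_4 \end{vmatrix} = n(s_2 s_4 - s_3^2) - s_2^3 = n^3 \mu_2^3 (\beta_2 - \beta_1 - 1),$$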
* I owe this neat proof to the kindness of Mr. G. N. Watson.
which must therefore be either zero or a positive quantity. Thus we see that the
whole area covered by my frequency curves is limited above by the line $\beta_2 - \beta_1 - 1 = 0$
and below by the line $8\beta_2 - 15\beta_1 - 36 = 0$. The first line limits all frequency; the
second line limits my types.*
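
The two limiting lines can be turned into a small check on a pair $(\beta_1, \beta_2)$. This is only a sketch: the function name `region` and the returned labels are hypothetical, and it assumes that "above" a line on the chart means the smaller value of $\beta_2$, as the proof that $\beta_2$ cannot be less than $\beta_1 + 1$ suggests.

```python
def region(beta1, beta2, tol=1e-12):
    """Place (beta1, beta2) relative to the two limiting lines.

    Assumption: 'above' on the chart means smaller beta2, so points with
    beta2 < beta1 + 1 are impossible and points with
    8*beta2 - 15*beta1 - 36 > 0 lie beyond the second line.
    """
    limit = beta2 - beta1 - 1.0
    hetero = 8.0 * beta2 - 15.0 * beta1 - 36.0
    if limit < -tol:
        return "impossible"
    if abs(limit) <= tol:
        return "limit of all frequency"
    if hetero > tol:
        return "heterotypic"
    return "within the two lines"
```

For example, the Gaussian point `region(0.0, 3.0)` falls within the two lines, while `region(0.0, 1.0)` lies on the limit of all frequency.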

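(2) Before proceeding further, let us examine the limit to all frequency. Consider
the line $\beta_2 - \beta_1 - 1 = 0$.

The form of the curve is†

$$y = y_0\left(1 + \frac{x}{a_1}\right)^{m_1}\left(1 - \frac{x}{a_2}\right)^{m_2}.$$

Now $m'_1 = m_1 + 1$ and $m'_2 = m_2 + 1$ are the roots of

$$m'^2 - r m' + \epsilon = 0,$$

where

$$r = \frac{6(\beta_2 - \beta_1 - 1)}{3\beta_1 - 2\beta_2 + 6},$$

therefore $r = 0$ and $\epsilon = r^2/\{4 + \tfrac14\beta_1(r+2)^2/(r+1)\}$ also $= 0$.
Hence

$m'_1 + m'_2 = 0$ and $m'_1 m'_2 = 0$, or $m_1 = -1$, $m_2 = -1$.
The form of the curve is accordingly

$$y = \frac{y_0}{(1 + x/a_1)(1 - x/a_2)},$$

or, apparently, U-shaped. Now

$$b = \tfrac12 \sigma \{\beta_1 (r+2)^2 + 16(r+1)\}^{1/2}$$

and is finite. But

$$y_0 = \frac{N}{b}\,\frac{m_1^{m_1} m_2^{m_2}}{(m_1+m_2)^{m_1+m_2}}\,\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}$$

$$= \frac{N}{b}\,\frac{(m_1+1)(m_2+1)}{m_1+m_2+2}\,\frac{m_1^{m_1} m_2^{m_2}}{(m_1+m_2)^{m_1+m_2}}\,\frac{(m_1+2)(m_2+2)}{m_1+m_2+3}\,\frac{\Gamma(m_1+m_2+4)}{\Gamma(m_1+3)\,\Gamma(m_2+3)}$$

$$= \frac{N}{b} \times \text{limit of } \frac{(m_1+1)(m_2+1)}{m_1+m_2+2} \times \frac{4\,\Gamma(2)}{\Gamma(2)\,\Gamma(2)},$$

$$\text{limit of } \frac{(m_1+1)(m_2+1)}{m_1+m_2+2} = \text{limit of } \frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2},$$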

* It is not accurately correct to say it limits my types of skew curves. What it actually does is to cut
off an area in which the probable errors of the constants of Types IV. and VI. curves can be very great.
The curves may give a good fit, but the constants cannot be cited as characteristics of the frequency
distribution as they are unstable.

† The notation throughout is that of my original 'Phil. Trans.' memoirs of 1895 and 1901.
when both $\lambda_1$, $\lambda_2$ are to be made vanishingly small, being $m_1 + 1$ and $m_2 + 1$ respectively.

Thus the limit

$$= \frac{1}{\dfrac{1}{\lambda_1} + \dfrac{1}{\lambda_2}} = 0.$$

Hence $y_0$ vanishes or $y$ is zero at all points, but $x = -a_1$ and $x = a_2$ where it is
undetermined.

Since $m_1/a_1 = m_2/a_2$, we have $a_1 = a_2$, and the frequency really consists of two
concentrated groups at $-a_1$ and $a_2$, or at $\pm\tfrac12 b$.

If $\mu'_1$ and $\mu''_1$ be the distances of the centroid from the two ends of the range,

$$\frac{\mu'_1}{\mu''_1} = \frac{n''}{n'},$$

where $n'$ and $n''$ are the frequencies concentrated at the range terminals. But
$\mu'_1 = b(m_1 + 1)/(m_1 + m_2 + 2)$, or we have $\mu'_1/\mu''_1 = (m_1 + 1)/(m_2 + 1) = \lambda_1/\lambda_2$, or is the
finite quantity which marks the ratio of the vanishing of $m_1 + 1$ and $m_2 + 1$; this,
therefore, is equal to $n''/n'$.
Clearly 

f.', = {n" + n')/{n" + n')^ = ih\ 



^'^^{n"-n')l{n" + n')\ 



3 
J 



and 



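Thus

$$\mu_2 = b^2\, n'n''/(n' + n'')^2,$$

$$\mu_3 = b^3\, n'n''(n'' - n')/(n' + n'')^3,$$

$$\mu_4 = b^4\, n'n''(n'^2 + n''^2 - n'n'')/(n' + n'')^4,$$

$$\beta_1 = \frac{(n'' - n')^2}{n'n''}, \qquad \beta_2 = \frac{n'^2 + n''^2 - n'n''}{n'n''},$$

giving as verification $\beta_2 - \beta_1 - 1 = 0$.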

Thus the whole problem is solved if we know the magnitude of the two frequencies
$n'$ and $n''$ concentrated at $-\tfrac12 b$ and $+\tfrac12 b$.

As special cases the point on the $\beta_2$-axis gives $\beta_1 = 0$, $\beta_2 = 1$, and represents two
equal concentrated frequency lumps $n' = n'' = \tfrac12 N$. The point at $\infty$ on the line
$\beta_2 - \beta_1 - 1 = 0$, or $\beta_1 = \beta_2 = \infty$, represents a single frequency lump, for which $n' = 0$,
$n'' = N$. I speak of these concentrated frequency lumps lying on the line $\beta_2 - \beta_1 - 1 = 0$,
as block-frequency, and represent them by the letter B; they correspond to points on
the B-line. (See Diagram, Plate 1.)
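
The verification $\beta_2 - \beta_1 - 1 = 0$ for two lumps can be checked numerically. The sketch below is illustrative only: the name `two_lump_betas` is hypothetical, the lumps are placed at $-\tfrac12 b$ and $+\tfrac12 b$, and the moments are computed from first principles rather than from the closed forms, so the sign of $\mu_3$ follows that placement.

```python
def two_lump_betas(n1, n2, b=1.0):
    """beta1, beta2 for frequencies n1 at -b/2 and n2 at +b/2.

    Assumes n1 > 0 and n2 > 0; moments are taken about the centroid.
    """
    total = n1 + n2
    mean = (-0.5 * b * n1 + 0.5 * b * n2) / total

    def mu(k):
        # k-th moment coefficient about the centroid
        return (n1 * (-0.5 * b - mean) ** k + n2 * (0.5 * b - mean) ** k) / total

    beta1 = mu(3) ** 2 / mu(2) ** 3
    beta2 = mu(4) / mu(2) ** 2
    return beta1, beta2
```

With `two_lump_betas(1, 1)` the sketch gives the point $\beta_1 = 0$, $\beta_2 = 1$ on the $\beta_2$-axis; unequal lumps move the point along the B-line.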

The most remarkable limiting case of this kind has been already referred to. It
will be shown in practical examples in a memoir on "small samples," now nearly ready
for press, that the correlation between two variates may be determined by sampling
these populations in pairs, and merely observing, which can be usually done without
measurement, whether the pair is positively or negatively correlated. The ratio of
the two frequency "lumps" easily provides the correlation.*

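(3) Let us now consider the nature of the frequency on the loop of the biquadratic.
Taking the form of the curve to be

$$y = y_0\left(1 + \frac{x}{a_1}\right)^{m_1}\left(1 - \frac{x}{a_2}\right)^{m_2},$$

we know that $m_1$ and $m_2$ are the roots of the quadratic

$$m^2 - m(r - 2) + \epsilon - r + 1 = 0,$$

where

$$r = 6(\beta_2 - \beta_1 - 1)/(3\beta_1 - 2\beta_2 + 6),$$

and

$$\epsilon = \frac{r^2}{4 + \tfrac14\beta_1 (r+2)^2/(r+1)}.$$

Now $\epsilon - r + 1 = 0$ provides the biquadratic

$$\beta_1(8\beta_2 - 9\beta_1 - 12)(\beta_2 + 3)^2 - (10\beta_2 - 12\beta_1 - 18)^2(4\beta_2 - 3\beta_1) = 0;$$

actually

$$\epsilon - r + 1 = \frac{(10\beta_2 - 12\beta_1 - 18)^2(4\beta_2 - 3\beta_1) - \beta_1(8\beta_2 - 9\beta_1 - 12)(\beta_2 + 3)^2}{(3\beta_1 - 2\beta_2 + 6)\{\beta_1(\beta_2 + 3)^2 + 4(4\beta_2 - 3\beta_1)(3\beta_1 - 2\beta_2 + 6)\}}.$$

Now $\beta_1$, $4\beta_2 - 3\beta_1$ and $\beta_2 + 3$ are by their nature essentially positive. Hence,
provided $3\beta_1 - 2\beta_2 + 6$ is positive, i.e., as long as we deal with points above the line
$2\beta_2 - 3\beta_1 - 6 = 0$, i.e., the Type III. curve line, $\epsilon - r + 1$ will be positive, if $(\beta_1, \beta_2)$ lie
outside the loop of the biquadratic. But within the loop it is negative, or one value
of $m$ must be negative, or we reach an infinite ordinate at $x = -a_1$ or $a_2$, i.e., a
J-shaped curve. The other ordinate at $x = a_2$ or $-a_1$ is zero, because the other $m$ must
be a finite positive quantity.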
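
The sign argument above can be tried numerically. The following is a minimal sketch, not a definitive implementation: the function name `type_i_exponents` is hypothetical, and it assumes the expressions for $r$ and $\epsilon$ in the form $r = 6(\beta_2 - \beta_1 - 1)/(3\beta_1 - 2\beta_2 + 6)$ and $\epsilon = r^2/\{4 + \tfrac14\beta_1(r+2)^2/(r+1)\}$. It fails on the Type III. line, where the denominator of $r$ vanishes.

```python
import math


def type_i_exponents(beta1, beta2):
    """Return (r, eps, (m_small, m_large)) from the quadratic
    m^2 - m(r - 2) + eps - r + 1 = 0.

    Raises ValueError when the roots are complex (point outside the
    region where this limited-range form applies).
    """
    d = 3.0 * beta1 - 2.0 * beta2 + 6.0
    r = 6.0 * (beta2 - beta1 - 1.0) / d
    eps = r * r / (4.0 + 0.25 * beta1 * (r + 2.0) ** 2 / (r + 1.0))
    c = eps - r + 1.0
    disc = (r - 2.0) ** 2 - 4.0 * c
    if disc < -1e-12:
        raise ValueError("complex exponents for this (beta1, beta2)")
    root = math.sqrt(max(disc, 0.0))  # clamp rounding noise only
    return r, eps, ((r - 2.0 - root) / 2.0, (r - 2.0 + root) / 2.0)
```

A point taken inside the loop, such as $(\beta_1, \beta_2) = (0.5, 2.4)$, returns one negative and one positive exponent, in line with the J-shaped form described above.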

If $\epsilon - r + 1 = 0$, i.e., along the biquadratic loop, one value of $m$ is zero, and the
other is positive if $r$ be greater than 2, and negative if it be less than 2. But

$$r - 2 = \frac{2(5\beta_2 - 6\beta_1 - 9)}{3\beta_1 - 2\beta_2 + 6}.$$

Accordingly above the line $5\beta_2 - 6\beta_1 - 9 = 0$, and above the line $2\beta_2 - 3\beta_1 - 6 = 0$, $r - 2$
will be negative, but these lines do not meet in the positive quadrant. Hence all



* See "Student," 'Biometrika,' vol. VI., p. 304, and Fisher, 'Biometrika,' vol. X., p. 608.
along the upper boundary of the loop one $m$ is zero and the other negative.
Accordingly, from the R-point round the upper boundary of the loop, we have the
curve

$$y = y_0 (1 + x/a_1)^{m_1}.$$

I call this curve Type VIII.

Since $m_1/a_1 = m_2/a_2$, and $m_2$ is zero while $m_1$ and $a_1$ are finite, it follows that
$a_2 = 0$, and accordingly the range of frequency is from $x = 0$ to $x = -a_1$. The curve
is therefore a J-shaped curve with infinite ordinate at one end of the range and a
finite ordinate at the other.

Now consider the lower side of the loop. Here $5\beta_2 - 6\beta_1 - 9$ will be positive, for
this side is below the R-line and $3\beta_1 - 2\beta_2 + 6$ will also be positive until the point
in which the line $2\beta_2 - 3\beta_1 - 6 = 0$ meets the lower side of the loop, i.e., the point
$\beta_1 = 4$, $\beta_2 = 9$. Hence from the R-point up to $\beta_1 = 4$, $\beta_2 = 9$, a point practically
outside the range of the customary statistical frequencies, $r - 2$ will be positive, or
$m_1$ will be positive. Further $m_1$ and $a_1$ being finite and $m_2$ zero, it follows that $a_2$ is
zero, or the curve is

$$y = y_0 (1 + x/a_1)^{m_1}.$$

In this case the curve has a zero ordinate at one end and a finite ordinate at the
other. I term this curve Type IX.

At the point where the line $2\beta_2 - 3\beta_1 - 6 = 0$ meets the biquadratic, Type IX.
agrees with my earlier Type III.

The equation to that type is*

$$y = y_0 (1 + x/a)^{\gamma a} e^{-\gamma x},$$

where

$$\gamma a = \frac{4}{\beta_1} - 1, \quad \text{and} \quad \gamma = \frac{2}{\sigma\sqrt{\beta_1}}.$$

Hence for $\beta_1 = 4$, $\gamma a = 0$, and $\gamma = 1/\sigma$. Thus $a$ is zero and the curve becomes

$$y = y_0 e^{-x/\sigma},$$

the range being from 0 to $\infty$.

But in Type IX., since $r$ has become infinite, $m_1$ is infinite and the limit to

$$y = y_0 (1 + x/a_1)^{m_1}$$

is accordingly the exponential curve

$$y = y_0 e^{\lambda x},$$

as we shall see shortly $\lambda$ must equal $1/\sigma$, where $\sigma$ is the standard deviation.

I propose to call this exponential curve Type X., and the point $\beta_1 = 4$, $\beta_2 = 9$, E or
the exponential point.

* 'Phil. Trans.,' A, vol. 186, p. 373. 
Beyond the exponential point, our biquadratic branch has entered the area of
Type VI. curves,* and $m_1$ will now again be negative.
Now the equation to Type VI. is

$$y = y_0 (x - a)^{q_2} x^{-q_1},$$

and the range from $x = a$ to $\infty$. The special case of this along the branch of the
biquadratic occurs when $q_2 = 0$, leading to†



$$y = y_0\, x^{-q_1},$$

where

$$q_1 = 1 - \epsilon = 2 - r,$$

which is positive, since we are now beneath both the lines

$$5\beta_2 - 6\beta_1 - 9 = 0 \quad \text{and} \quad 2\beta_2 - 3\beta_1 - 6 = 0.‡$$



This curve, which will be more fully considered below, has a range from a certain
value $a$ to $\infty$. It thus starts with a finite ordinate and asymptotes to zero. It is
a transition curve extending from the exponential point along the lower limb of the
biquadratic loop. I call this curve Type XI. The biquadratic never cuts the cubic
along which Type V. lies and no further change occurs in Type XI.
I now pass to the consideration of the R-line or $5\beta_2 - 6\beta_1 - 9 = 0$.
The general differential equation§ to the type of frequency curve under
consideration is

$$\frac{1}{y}\frac{dy}{dx} = -\frac{\sqrt{\beta_1}(\beta_2 + 3) + (10\beta_2 - 12\beta_1 - 18)\,x/\sigma}{\sigma\{4\beta_2 - 3\beta_1 + \sqrt{\beta_1}(\beta_2 + 3)\,x/\sigma + (2\beta_2 - 3\beta_1 - 6)\,x^2/\sigma^2\}},$$

the origin being at the mean.

Hence if $5\beta_2 - 6\beta_1 - 9 = 0$, the term in $x/\sigma$ disappears from the numerator, and we
can further get rid of $\beta_2$ by substituting $\tfrac15(6\beta_1 + 9)$ for it. Making this substitution,
we reach

$$\frac{1}{y}\frac{dy}{dx} = \frac{-2\sqrt{\beta_1}}{\sigma(3 + \beta_1) - \sigma(\sqrt{\beta_1} - x/\sigma)^2}.$$

* 'Phil. Trans.,' A, vol. 197, p. 449.

† Loc. cit., Equations, bottom of p. 449.

‡ As we pass outwards from the exponential point along the biquadratic $q_1$ ranges from $\infty$ to 5, which
it reaches at the asymptote to the biquadratic $\beta_1 = 50$, or when $\beta_2 = \infty$, $\beta_1 = 50$.

§ "Mathematical Contributions to the Theory of Evolution, XIV. On the General Theory of Skew
Correlation and Non-Linear Regression," p. 6, 'Drapers' Company Research Memoirs,' Cambridge
University Press.

VOL. CCXVI.-A. 3 O
This leads on integration to





I term this Type XII., or the R-line J-curve. The origin is the mean, the range
from $x = -\sigma(\sqrt{3 + \beta_1} + \sqrt{\beta_1})$ to $\sigma(\sqrt{3 + \beta_1} - \sqrt{\beta_1})$. It separates J-curves, so long as
we are above the line $2\beta_2 - 3\beta_1 - 6 = 0$, for which $r - 2$ is positive from those in which
$r - 2$ is negative. But $r - 2 = m_1 + m_2$. Hence below the R-line the positive $m_1$ is
greater than the negative $m_2$, but above this line the positive $m_1$ is less than the
negative $m_2$, i.e., the upright of the J is emphasised at the expense of the horizontal
part, while below the R-line this condition is reversed until on the biquadratic the
infinite ordinate of the J-upright is replaced by a finite ordinate.

I propose now to consider a little in detail the nature of these new types of frequency
and the manner of fitting them to actual data. I have dealt above sufficiently fully
with "block-frequency" and its criterion $\beta_2 - \beta_1 - 1 = 0$ and therefore need only
consider Types VIII. to XII.

(4) Frequency Curve, Type VIII.

$$y = y_0 \left(1 + \frac{x}{a}\right)^{-m},$$

i.e., from $x = 0$ to $x = -a$.*
$y_0$ is clearly the value of the ordinate at $x = 0$, i.e., the finite ordinate at the tail.
We easily deduce if $N$ be the total frequency $y_0 = N(1 - m)/a$, and taking the
origin at $x = -a$,

$$\bar{x} = \mu'_1 = a(1 - m)/(2 - m), \qquad \mu'_2 = a^2(1 - m)/(3 - m),$$
$$\mu'_3 = a^3(1 - m)/(4 - m), \qquad \mu'_4 = a^4(1 - m)/(5 - m).$$

Hence for the moment-coefficients about the mean

$$\sigma^2 = \mu_2 = a^2(1 - m)/\{(3 - m)(2 - m)^2\},$$
$$\mu_3 = 2a^3 m(1 - m)/\{(4 - m)(3 - m)(2 - m)^3\},$$
$$\mu_4 = 3a^4(1 - m)(3m^2 - 5m + 4)/\{(5 - m)(4 - m)(3 - m)(2 - m)^4\}.$$

These lead to

$$\beta_1 = \frac{4m^2(3 - m)}{(1 - m)(4 - m)^2}, \qquad \beta_2 = \frac{3(3 - m)(4 - 5m + 3m^2)}{(1 - m)(4 - m)(5 - m)}.$$

Clearly $m$ could be found from the value of $\beta_1$ by solving the cubic equation

$$m^3(4 - \beta_1) + m^2(9\beta_1 - 12) - 24\beta_1 m + 16\beta_1 = 0,$$

* Of course, whether $a$ is really positive or negative will depend on the sign given to $x$, or the direction
of the $x$-axis.
then $a$ is determined from

$$a = \pm\sigma(2 - m)\sqrt{\frac{3 - m}{1 - m}},$$

the sign being determinable from the observed value of $\mu_3$, and $y_0$ from

$$y_0 = \frac{N(1 - m)}{a} = \frac{N}{\sigma}\,\frac{1 - m}{2 - m}\sqrt{\frac{1 - m}{3 - m}},$$

and the placing of the frequency curve on the observations by

$$\mu'_1 = a(1 - m)/(2 - m).$$

If, however, we find $10\beta_2 - 12\beta_1 - 18$ and $3\beta_1 - 2\beta_2 + 6$, we have

$$10\beta_2 - 12\beta_1 - 18 = -\frac{24m(2 - m)^3}{(1 - m)(5 - m)(4 - m)^2}, \qquad 3\beta_1 - 2\beta_2 + 6 = \frac{24(2 - m)^3}{(1 - m)(5 - m)(4 - m)^2},$$

giving

$$m = -\frac{2(5\beta_2 - 6\beta_1 - 9)}{3\beta_1 - 2\beta_2 + 6},$$

and thus since $m$ is to be positive, the point $(\beta_1, \beta_2)$ must be above the line
$5\beta_2 - 6\beta_1 - 9 = 0$. The line $2\beta_2 - 3\beta_1 - 6 = 0$ does not meet $5\beta_2 - 6\beta_1 - 9 = 0$ in the
positive quadrant, so that a point below both these lines does not exist in real
frequency. Clearly

$$1 - m = (8\beta_2 - 9\beta_1 - 12)/(3\beta_1 - 2\beta_2 + 6),$$

$$3 - m = (4\beta_2 - 3\beta_1)/(3\beta_1 - 2\beta_2 + 6),$$

$$4 - m = 2(\beta_2 + 3)/(3\beta_1 - 2\beta_2 + 6),$$

and thus if these values be substituted in $\beta_1$ as given above, we reach

$$\beta_1(\beta_2 + 3)^2(8\beta_2 - 9\beta_1 - 12) = (4\beta_2 - 3\beta_1)(10\beta_2 - 12\beta_1 - 18)^2,$$

the equation to the biquadratic, proving that the point associated with the above
frequency curve lies on the biquadratic.
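
The fitting steps for Type VIII. may be sketched as follows. The function names `betas_type_viii` and `fit_type_viii` are hypothetical, the positive sign is assumed for $a$, and $m$ is taken from $m = -2(5\beta_2 - 6\beta_1 - 9)/(3\beta_1 - 2\beta_2 + 6)$ rather than from the cubic in $\beta_1$.

```python
import math


def betas_type_viii(m):
    """beta1, beta2 of y = y0 (1 + x/a)^(-m) for 0 <= m < 1."""
    beta1 = 4.0 * m * m * (3.0 - m) / ((1.0 - m) * (4.0 - m) ** 2)
    beta2 = (3.0 * (3.0 - m) * (4.0 - 5.0 * m + 3.0 * m * m)
             / ((1.0 - m) * (4.0 - m) * (5.0 - m)))
    return beta1, beta2


def fit_type_viii(N, sigma, beta1, beta2):
    """Return (m, a, y0), assuming (beta1, beta2) lies on the upper
    branch of the biquadratic loop and taking a positive."""
    m = -2.0 * (5.0 * beta2 - 6.0 * beta1 - 9.0) / (3.0 * beta1 - 2.0 * beta2 + 6.0)
    if not 0.0 <= m < 1.0:
        raise ValueError("m outside 0 <= m < 1; not a Type VIII point")
    a = sigma * (2.0 - m) * math.sqrt((3.0 - m) / (1.0 - m))
    y0 = N * (1.0 - m) / a
    return m, a, y0
```

As a round-trip check, `fit_type_viii(1000, 2.0, *betas_type_viii(0.5))` should return $m$ close to 0.5, with $y_0 a/(1 - m)$ reproducing the total frequency.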

Again $1 - m$ will always be positive, or $m$ less than unity. For the upper branch
of the loop of the biquadratic lies below its asymptote, or $8\beta_2 - 9\beta_1 - 12\tfrac23 = 0$, and
accordingly below the line $8\beta_2 - 9\beta_1 - 12 = 0$; thus the numerator of $1 - m$ is always
positive. So also is the denominator, for the upper branch always lies above the line
$2\beta_2 - 3\beta_1 - 6 = 0$.*

* In fact the R-line ($5\beta_2 - 6\beta_1 - 9 = 0$), the parallel to the asymptote ($8\beta_2 - 9\beta_1 - 12 = 0$), the limiting
frequency line ($\beta_2 - \beta_1 - 1 = 0$), and the Type III. line ($2\beta_2 - 3\beta_1 - 6 = 0$) meet in the point $\beta_2 = -3$,
$\beta_1 = -4$ of the negative quadrant and the upper branch of the loop lies in the angle between the first two
and in the positive quadrant.

As $m$ is positive and less than unity the area and moments of the curve are all real
and finite. When the point $(\beta_2, \beta_1)$ moves along the loop of the biquadratic towards
the R-point ($\beta_2 = 1.8$, $\beta_1 = 0$), the value of $1 - m$ becomes more and more nearly
unity, and ultimately at R we have $m = 0$, or the frequency curve is

$$y = y_0,$$

a rectangle, i.e., we reach the rectangle point. If on the other hand we move towards
infinity along the upper branch of the biquadratic loop, we find $1 - m$ approaches the
value $1/\beta_1$ and thus ultimately becomes zero, or $m = 1$. Thus the limiting form of
the frequency curve is a rectangular hyperbola, or rather the part of such hyperbola
from the vertical asymptote $x = -a$ to $x = 0$.

But this is clearly only a theoretical limit, for it involves $\beta_2 = \beta_1 = \infty$, and this
means that if $\mu_2$ be finite, $\mu_3$ and $\mu_4$ are infinite, results impossible in any actual frequency
if the population be finite. It is clear indeed that $\beta_2$ must be less than $N$, for
obviously $N\mu_4 < N^2\mu_2^2$. Again, $\beta_1$ is $< \beta_2 - 1$, and accordingly $\beta_1 < N - 1$.* But these
limits are of small service for practical statistics, where even for small samples, say,
$N = 20$, they would scarcely ever be approached.† Thus the rectangular hyperbola can
only be treated as a limiting form of Type VIII. far beyond the region of actual
statistical experience.‡ For practical purposes the point is that $m$ is limited to
values between 0 and 1, or Type VIII. ranges from the rectangle to the rectangular
hyperbola. The suggestiveness of this is that curves in the $\mathrm{I}_U$ and the $\mathrm{I}_J$ areas, i.e.,
above and below the upper branch of the biquadratic loop, must approach these types
as they approach the extremes of this branch. Generally a U-curve near the
biquadratic will be close to a curve resembling a curtailed hyperbola.

* Mr. G. N. Watson has given me a narrower limit to $\beta_2$, namely, $\beta_2 \le N - 2 + 1/(N - 1)$; except as
showing that $\beta_2$ must be finite, which is otherwise obvious, this is again of no real service.

† The highest observed values that I know of for $\beta_2$ and $\beta_1$ are those given by Duncker ('Biometrika,'
vol. VIII., p. 238). He gives

'Armzahl,' Asterina exigua    $N = 600$    $\beta_2 = 33.13$,    $\beta_1 = 1.76$,
           Archaster typicus  $N = 902$    $\beta_2 = 128.48$,   $\beta_1 = 4.76$.

There are only three groups of frequency in each, 4, 5 and 6, and the bulk of the observations are
concentrated in 5. The observations do not give, as he suggests, Pearson's Type IV. and Type VI.
curves respectively; the $\kappa_2$ in both cases is less than unity, corresponding to Type IV. But both fall into
the heterotypic area of Type IV. The attempt to fit with heterotypic curves would hardly be profitable
until there was absolute certainty that the group with 4 'Armzahl' was not the result of accident.

‡ Theoretically very high values of $\beta_1$ and $\beta_2$ can easily be found, i.e., for samples of four, when the
population sampled has, say, a correlation of 0.98; here the frequency curve for the correlation
coefficient gives $\beta_1 = 203.325$ and $\beta_2 = 311.731$, but it is the rapidly approaching zero of $\mu_2$ which leads
to these results.
In concluding our discussion of this curve we may note that, perhaps, the easiest
way of tracing the biquadratic is to calculate $\beta_1$ and $\beta_2$ from

$$\beta_1 = \frac{4(2\gamma - 1)^2(\gamma + 1)}{3\gamma - 1}, \qquad \beta_2 = \frac{3(\gamma + 1 + \beta_1)}{3 - \gamma} = \frac{3(\gamma + 1)(16\gamma^2 - 13\gamma + 3)}{(3\gamma - 1)(3 - \gamma)},$$

by giving a succession of values to $\gamma$.

For $\gamma = 0.5$ to 3 we get the points on the lower branch of the loop; for $\gamma = 0.5$ to
0.3 we obtain the points on the upper branch of the loop. It will be seen that this
amounts to taking the origin at $\beta_2 = -3$, $\beta_1 = -4$, and rotating a line through this
point round it to intersect the curve. The slope of this line to the $\beta_1$ axis is $3/(3 - \gamma)$.
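
The tracing rule lends itself to a short numerical sketch. The names `biquadratic_point` and `biquadratic_residual` are hypothetical; the residual is simply the difference of the two sides of the biquadratic, so it should vanish (up to rounding) at every traced point. Note that the expression for $\beta_1$ has a pole at $\gamma = 1/3$, so values on the upper branch are assumed to be taken strictly above that.

```python
def biquadratic_point(g):
    """(beta1, beta2) on the biquadratic for parameter g (gamma).

    Assumes 1/3 < g < 3; g = 0.5 gives the rectangle-point.
    """
    beta1 = 4.0 * (2.0 * g - 1.0) ** 2 * (g + 1.0) / (3.0 * g - 1.0)
    beta2 = 3.0 * (g + 1.0 + beta1) / (3.0 - g)
    return beta1, beta2


def biquadratic_residual(beta1, beta2):
    """Left side minus right side of the biquadratic."""
    lhs = beta1 * (8.0 * beta2 - 9.0 * beta1 - 12.0) * (beta2 + 3.0) ** 2
    rhs = (4.0 * beta2 - 3.0 * beta1) * (10.0 * beta2 - 12.0 * beta1 - 18.0) ** 2
    return lhs - rhs
```

For instance, `biquadratic_point(0.5)` gives the rectangle-point $(0, 1.8)$ and `biquadratic_point(1.0)` gives $(4, 9)$.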

The cubic, it may be here noted, which gives the Type V. curve may be traced from

$$\beta_1 = 4(\gamma^2 - 1), \qquad \beta_2 = \frac{3(\gamma + 1 + \beta_1)}{3 - \gamma} = \frac{3(\gamma + 1)(4\gamma - 3)}{3 - \gamma}.$$

Here $\gamma$ must be given values from 1 to 3.

The Type III. line, which passes through the Gaussian point, also passes through
$\beta_2 = -3$ and $\beta_1 = -4$, and the above means of getting at the points on the cubic
corresponds to finding the points in which a straight line passing through $(-4, -3)$
and rotating from the position of the Type III. line cuts the cubic, its slope in any
position being as before $3/(3 - \gamma)$.

Actually if $\theta$ be the angle between the above line from $(-4, -3)$ to the cubic, i.e.,

$$\tan\theta = 3/(3 - \gamma),$$

$$r = 12(\sec\theta - \operatorname{cosec}\theta),$$

but to use this polar equation has not been found a very ready manner of plotting the
cubic.*

(5) Frequency Curve. Type IX.

$$y = y_0 \left(1 + \frac{x}{a}\right)^{m}.$$

Range from $x = -a$ to $x = 0$; $y$ is zero at one end of the range and equal to $y_0$ at
the other.

The analysis proceeds precisely as in the case of the curve of Type VIII., except
that $m$ is now opposite in sign. We have

$$y_0 = N(1 + m)/a,$$

$$\bar{x}\ (= \text{distance of mean from point } x = -a) = a(m + 1)/(m + 2),$$

$$\mu_2 = \sigma^2 = a^2(m + 1)/\{(m + 3)(m + 2)^2\},$$

$$\mu_3 = -2a^3 m(m + 1)/\{(m + 4)(m + 3)(m + 2)^3\},$$

$$\mu_4 = 3a^4(m + 1)(3m^2 + 5m + 4)/\{(m + 5)(m + 4)(m + 3)(m + 2)^4\},$$

* The parts of the cubic and the quartic lying in the other three quadrants have been plotted by
Miss B. C. B. Cave. Geometrically the interrelations of the two curves, their asymptotic and other
critical lines are of much interest, but until some interpretation can be put on imaginary values of the
moment coefficients, these interrelations have no statistical bearing.
leading to

$$\beta_1 = \frac{4m^2(m + 3)}{(m + 1)(m + 4)^2}, \qquad \beta_2 = \frac{3(m + 3)(3m^2 + 5m + 4)}{(m + 1)(m + 4)(m + 5)}.$$

Thus

$$m^3(\beta_1 - 4) + 3m^2(3\beta_1 - 4) + 24m\beta_1 + 16\beta_1 = 0$$

would give $m$, and $a$ would be found from

$$a = \pm\sigma(m + 2)\sqrt{\frac{m + 3}{m + 1}},$$

the sign being found from the observed value of $\mu_3$. Lastly

$$y_0 = \frac{N(m + 1)}{a} = \frac{N}{\sigma}\,\frac{m + 1}{m + 2}\sqrt{\frac{m + 1}{m + 3}}.$$

Practically it is better to determine $m$ from

$$m = \frac{2(5\beta_2 - 6\beta_1 - 9)}{3\beta_1 - 2\beta_2 + 6},$$

which value of $m$ substituted in the expression for $\beta_1$ gives the biquadratic.

Clearly since the lower branch of the biquadratic lies below the line $5\beta_2 - 6\beta_1 - 9 = 0$,
$m$ is positive until the line $2\beta_2 - 3\beta_1 - 6 = 0$ is reached, and in this section of the
branch, i.e., from $m = 0$ to $m = \infty$, or from $\beta_2 = 1.8$, $\beta_1 = 0$ up to $\beta_2 = 9$, $\beta_1 = 4$ (the
exponential point, E) occurs an interesting isolated point, the line-point L. When
$\beta_2 = 2.4$, $\beta_1 = 0.32$, then $m = 1$, and Type IX. degenerates into a sloping straight
line, $y = y_0(1 + x/a)$, or the frequency line is

$$y = \frac{\sqrt{2}\,N}{3\sigma}\left(1 \pm \frac{x}{3\sqrt{2}\,\sigma}\right).$$

Up to the line-point, Type IX. curve rises at $x = -a$ perpendicular to the axis
of $x$, at the line-point it makes a finite angle less than 90 degrees, and after the
line-point we start with contact at $x = -a$.

It is interesting to note the sloping line arising as a case of these generalised
frequency curves, and we observe that its locus is separated from the rectangle locus
by a considerable interval along the biquadratic in which the curve of Type IX. is
very trapezoidal in form.
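
A fit of Type IX. from $N$, $\sigma$, $\beta_1$ and $\beta_2$ might be sketched as below. The function name `fit_type_ix` is hypothetical, the positive sign is assumed for $a$, and the point is assumed to lie on the lower branch of the loop between the rectangle-point and the exponential point.

```python
import math


def fit_type_ix(N, sigma, beta1, beta2):
    """Return (m, a, y0) for y = y0 (1 + x/a)^m on -a <= x <= 0.

    Uses m = 2(5 beta2 - 6 beta1 - 9)/(3 beta1 - 2 beta2 + 6);
    the positive sign of a is an assumption of this sketch.
    """
    m = 2.0 * (5.0 * beta2 - 6.0 * beta1 - 9.0) / (3.0 * beta1 - 2.0 * beta2 + 6.0)
    if m < 0.0:
        raise ValueError("m negative; not a Type IX point")
    a = sigma * (m + 2.0) * math.sqrt((m + 3.0) / (m + 1.0))
    y0 = N * (m + 1.0) / a
    return m, a, y0
```

At the line-point, `fit_type_ix(N, sigma, 0.32, 2.4)` should give $m$ close to 1, $a$ close to $3\sqrt{2}\,\sigma$ and $y_0$ close to $\sqrt{2}N/(3\sigma)$.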

(6) Frequency Curve of Type X. The Exponential Curve. Beyond the line-point,
L at $\beta_2 = 2.4$, $\beta_1 = 0.32$, we reach as $m$ steadily mounts a series of frequency curves
which culminate in the exponential curve at E or $\beta_2 = 9$, $\beta_1 = 4$.

Clearly

$$y_0 = \frac{N(m + 1)}{a} = \frac{N}{\sigma}\,\frac{m + 1}{m + 2}\sqrt{\frac{m + 1}{m + 3}} = \frac{N}{\sigma} \text{ when } m \text{ is infinite.}$$

Further

$$y = \frac{N}{\sigma}\left(1 \pm \frac{x}{\sigma(m + 2)}\right)^{m} = \frac{N}{\sigma}\, e^{\pm x/\sigma},$$

the range being from $x = 0$ to $-\infty$, if we take the positive sign, and from $x = 0$
to $+\infty$, if we take the negative. It is thus sufficient to consider

$$y = \frac{N}{\sigma}\, e^{-x/\sigma},$$

with range from $x = 0$ to $x = +\infty$. The first two moments of the area about $x = 0$
are $\nu'_1 = \sigma$ and $\nu'_2 = 2\sigma^2$. Thus $\bar{x} = \sigma$ and $\mu_2 = \sigma^2$, as it should. Lastly, $\mu_3 = 2\sigma^3$ and
$\mu_4 = 9\sigma^4$, giving $\beta_1 = 4$, $\beta_2 = 9$.

The fitting of the exponential curve presents no difficulty.
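
The moments of the exponential curve can be checked from the raw moments $\nu'_k = k!\,\sigma^k$ of $y = (N/\sigma)e^{-x/\sigma}$ about $x = 0$. This is a small sketch with a hypothetical function name, `exponential_betas`; it only converts raw moments to moments about the mean.

```python
from math import factorial


def exponential_betas(sigma=1.0):
    """Return (mean, mu2, beta1, beta2) for y = (N/sigma) exp(-x/sigma).

    Uses the raw moments nu'_k = k! sigma^k about x = 0.
    """
    nu = [factorial(k) * sigma ** k for k in range(5)]
    mean = nu[1]
    mu2 = nu[2] - mean ** 2
    mu3 = nu[3] - 3.0 * mean * nu[2] + 2.0 * mean ** 3
    mu4 = nu[4] - 4.0 * mean * nu[3] + 6.0 * mean ** 2 * nu[2] - 3.0 * mean ** 4
    return mean, mu2, mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2
```

Whatever value of `sigma` is supplied, the sketch returns $\beta_1 = 4$ and $\beta_2 = 9$, so that fitting reduces to setting $\sigma$ equal to the observed standard deviation (or mean distance from the start of the range).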

The exponential point E is a transition point of great interest as being even more
than the Gaussian point G the meeting point of many types. At E, Type IX.
changes to Type XI., but at E the familiar Type III. passes from a zero ordinate
at the limited end of the range to a J-curve with infinite ordinate. Further, E is a
point at which the areas of Type I. (Type $\mathrm{I}_L$) as a limited range with zero ordinates at
its terminals, and as a limited range with one infinite ordinate at a terminal (Type $\mathrm{I}_J$)
meet. Finally, Type VI. area, which lies between Type III. line and Type V. cubic,
is divided into two sections by Type XI., which lies along the lower branch of the
biquadratic loop below E. Below the biquadratic, Type VI. takes the form

$$y = y_0 (x - a)^{q_2}/x^{q_1},$$

with a range from $x = a$ to $\infty$, $q_1$ and $q_2$ being both positive. In the area, however,
below Type $\mathrm{III}_J$ and above Type XI., Type VI. takes the form $\mathrm{VI}_J$, or the J-shaped
curve

$$y = \frac{y_0}{x^{q_1}(x - a)^{q_2}},$$

with a range from $x = a$ to $x = \infty$. In this case $r = 6(\beta_2 - \beta_1 - 1)/(3\beta_1 - 2\beta_2 + 6)$ will
be negative, since we are below the line $2\beta_2 - 3\beta_1 - 6 = 0$. Further, $\epsilon$ is negative
since we are above the cubic or Type V. branch

$$4(4\beta_2 - 3\beta_1)(2\beta_2 - 3\beta_1 - 6) = \beta_1(\beta_2 + 3)^2.$$

Thus our quadratic

$$m'^2 - r m' + \epsilon = 0,$$

corresponds of necessity to real roots, of which one will be negative and the other
positive. The positive root will be

$$\tfrac12(\sqrt{r^2 - 4\epsilon} + r),$$

and is therefore numerically the smaller root since $r$ is negative; it will be less than
unity, and therefore $m' - 1 = m$ will be negative if

$$\tfrac12(\sqrt{r^2 - 4\epsilon} + r) < 1,$$

or

$$\epsilon - r + 1 > 0,$$

but this is the condition for the point $\beta_1$, $\beta_2$ lying inside the loop of the biquadratic.
Thus in this case we reach the J-shaped curve of Type $\mathrm{VI}_J$, or

$$y = \frac{y_0}{x^{q_1}(x - a)^{q_2}}.$$

In order that the area of this curve and its moments should be finite, it is clearly
needful that $q_2$ should be less than unity.

(7) Frequency Curve. Type XI. Beyond the exponential point the lower branch of the biquadratic is below the line $2\beta_2-3\beta_1-6 = 0$, and consequently $m$ is again negative and the curve takes the form

$$y = y_0\,x^{-m},$$

where

$$y_0 = N(m-1)\,b^{m-1}.$$

The range is, however, only limited in one direction, it is from $x = b$ to $x = \infty$, say.

This lower branch of the biquadratic loop tends to become vertical and asymptotic to the line $\beta_1 = 50$. Hence $m$ takes all values from $\infty$ down to 5.

Clearly, for moments about the origin $x = 0$,

$$\mu'_p = \frac{m-1}{m-p-1}\,b^{p},$$

and these will be real and finite if $p < m-1$, or only the fourth moment would fail at the limit $\beta_1 = 50$, which indeed cannot in practice be reached. At the same time if we want the probable error of the fourth moment to be finite, it is needful that $\mu_8$ should be finite or we must have $m > 9$. Thus $m = 9$ must be where the curve passes into the heterotypic region and becomes of doubtful application.

We easily find from the above result for $\mu'_p$,

$$\mu_4 = 3b^4(m-1)(3m^2-5m+4)/\{(m-2)^4(m-3)(m-4)(m-5)\},$$

leading to

$$\beta_1 = \frac{4m^2(m-3)}{(m-1)(m-4)^2}, \qquad \beta_2 = \frac{3(m-3)(3m^2-5m+4)}{(m-1)(m-4)(m-5)}.$$

Thus for $m = 9$ we find $\beta_1 = 9.72$, $\beta_2 = 22.725$, which satisfy the equation $8\beta_2-15\beta_1-36 = 0$ of the heterotypic line.
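The values quoted for $m = 9$ can be reproduced directly from the raw moments of the Type XI. curve. The sketch below is illustrative only: it assumes the curve $y = y_0 x^{-m}$ on the range $b$ to $\infty$ with raw moments $(m-1)b^p/(m-p-1)$ about $x = 0$, and the function name `type_xi_betas` is hypothetical.

```python
def type_xi_betas(m: float, b: float = 1.0):
    """beta1 and beta2 of y = y0 * x**(-m) on [b, inf); needs m > 5."""
    if m <= 5:
        raise ValueError("the fourth moment is infinite unless m > 5")
    # Raw moments about x = 0: mu'_p = (m - 1) / (m - p - 1) * b**p
    raw = [(m - 1) / (m - p - 1) * b**p for p in range(5)]
    mean = raw[1]
    # Transfer to the mean to obtain the central moments.
    mu2 = raw[2] - mean**2
    mu3 = raw[3] - 3 * mean * raw[2] + 2 * mean**3
    mu4 = raw[4] - 4 * mean * raw[3] + 6 * mean**2 * raw[2] - 3 * mean**4
    return mu3**2 / mu2**3, mu4 / mu2**2
```

Evaluating `type_xi_betas(9.0)` and substituting the pair into $8\beta_2-15\beta_1-36$ gives a value that vanishes to rounding error, in agreement with the statement that $m = 9$ lies on the heterotypic line. The result does not depend on `b`, as both constants are pure numbers.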

$m$ may be found from

$$m = \frac{2(5\beta_2-6\beta_1-9)}{2\beta_2-3\beta_1-6},$$

or from $\beta_1$ alone by the cubic

$$m^3(4-\beta_1)+m^2(9\beta_1-12)-24\beta_1 m+16\beta_1 = 0,$$

then

$$b = \pm\sigma(m-2)\sqrt{\frac{m-3}{m-1}},$$

and

$$y_0 = N(m-1)\,b^{m-1},$$

while the mean $\bar{x} = b(m-1)/(m-2)$ enables us to place the curve on the observations.
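The fitting steps for Type XI. can be sketched in a few lines. This is an illustration under stated assumptions, not the author's computing scheme: the positive sign of $b$ is taken (positive skewness), the normalisation $y_0 = N(m-1)b^{m-1}$ is the one that makes the area of $y_0 x^{-m}$ from $b$ to $\infty$ equal to $N$, and the names `fit_type_xi` and `origin` are hypothetical.

```python
from math import sqrt


def fit_type_xi(n_total, mean, sigma, beta1, beta2):
    """Return (m, b, y0, origin) for y = y0 * x**(-m), x measured from `origin`."""
    # Exponent from the two betas.
    m = 2 * (5 * beta2 - 6 * beta1 - 9) / (2 * beta2 - 3 * beta1 - 6)
    # Start of the range, positive sign assumed.
    b = sigma * (m - 2) * sqrt((m - 3) / (m - 1))
    # Assumed normalisation so that the area from b to infinity is n_total.
    y0 = n_total * (m - 1) * b ** (m - 1)
    # The curve's own mean is b (m - 1) / (m - 2); shift so it falls on the observed mean.
    origin = mean - b * (m - 1) / (m - 2)
    return m, b, y0, origin
```

On the scale of the observations the curve then runs from `origin + b` to infinity.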

There is no discontinuity in the form of the curve down to $m = 5$, but only discontinuity after $m = 9$ in the probable errors of its moment-coefficients.

The curve starts with a finite ordinate and meets that ordinate at a finite angle; it asymptotes to the $x$-axis at $x = \infty$, and has no point of inflexion except at infinity.

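(8) Frequency Curve. Type XII.

$$y = y_0\left(\frac{\sigma(\sqrt{3+\beta_1}+\sqrt{\beta_1})+x}{\sigma(\sqrt{3+\beta_1}-\sqrt{\beta_1})-x}\right)^{\sqrt{\beta_1/(3+\beta_1)}}.$$

This J-curve arises along the R-line, or $5\beta_2-6\beta_1-9 = 0$. Its range is from $x = \sigma(\sqrt{3+\beta_1}-\sqrt{\beta_1})$ to $x = -\sigma(\sqrt{3+\beta_1}+\sqrt{\beta_1})$, and then its mean is the origin. When $\beta_1$ is zero it degenerates into a rectangle (i.e., at the rectangle point).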

In order to illustrate the nature of the curve more fully let us start from the general equation which arises when the denominator of the differential equation has real roots,* i.e.,

$$y = y_0\left(1+\frac{x}{a_1}\right)^{m_1}\left(1-\frac{x}{a_2}\right)^{m_2},$$

where

$$y_0 = \frac{N}{b}\,\frac{m_1^{m_1}m_2^{m_2}}{(m_1+m_2)^{m_1+m_2}}\,\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}$$

and

$$\frac{a_1}{m_1} = \frac{a_2}{m_2},$$

the origin being the mode and $b$ the range.

Transferring to the mean as origin† this becomes

$$y = \frac{y_0}{a_1^{m_1}a_2^{m_2}}\left(\frac{b(m_1+1)}{m_1+m_2+2}+x\right)^{m_1}\left(\frac{b(m_2+1)}{m_1+m_2+2}-x\right)^{m_2},$$

where*

$$b = \tfrac12\sigma\{\beta_1(m_1+m_2+4)^2+16(m_1+m_2+3)\}^{1/2},$$

$$\frac{y_0}{a_1^{m_1}a_2^{m_2}} = \frac{N}{b^{m_1+m_2+1}}\,\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)},$$

on substitution for $a_1$ and $a_2$ as above.

* 'Phil. Trans.,' A, vol. 186, p. 369.
† Loc. cit., p. 370.

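Now put $m_1+m_2 = 0$, or $m_1 = -m_2 = m$, say.

Then

$$y = \frac{N\,\Gamma(2)}{b\,\Gamma(1+m)\,\Gamma(1-m)}\left(\frac{\tfrac12 b(1+m)+x}{\tfrac12 b(1-m)-x}\right)^{m},$$

while

$$b = 2\sigma\sqrt{3+\beta_1}.$$

It remains to find $m$.

Now $m_1$ and $m_2$ are the roots of†

$$m^2-(r-2)m+\epsilon-r+1 = 0,$$

where

$$r = \frac{6(\beta_2-\beta_1-1)}{3\beta_1-2\beta_2+6},$$

$$\epsilon = \frac{(m_1+m_2+2)^2}{4+\tfrac14\beta_1(m_1+m_2+4)^2/(m_1+m_2+3)},$$

and

$$r = m_1+m_2+2.$$

Hence, when $m_1+m_2 = 0$, we have

$$r = 2 \quad\text{or}\quad 5\beta_2-6\beta_1-9 = 0,\ \text{the R-line},$$

$$\epsilon = 3/(\beta_1+3).$$

Whence

$$m^2 = 1-\epsilon \quad\text{or}\quad m = \pm\sqrt{\frac{\beta_1}{3+\beta_1}}.$$

But $\Gamma(2) = 1$, and it is well known that

$$\Gamma(1+m)\,\Gamma(1-m) = \frac{m\pi}{\sin m\pi}.$$

Thus

$$y = \frac{N}{\sigma}\,\frac{\sin\left(\pi\sqrt{\beta_1/(3+\beta_1)}\right)}{2\pi\sqrt{\beta_1}}\left(\frac{\sigma(\sqrt{3+\beta_1}+\sqrt{\beta_1})+x}{\sigma(\sqrt{3+\beta_1}-\sqrt{\beta_1})-x}\right)^{\sqrt{\beta_1/(3+\beta_1)}}.$$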

This is the full equation to the R-line J-curve, the mean being origin.‡ It requires for its determination only a knowledge of $\beta_1$, but we must be also certain that the condition $5\beta_2-6\beta_1-9 = 0$ is satisfied within the limits of random sampling. Its possibilities extend from $\beta_1 = 0$ to $\beta_1 = \infty$. When $\beta_1 = 0$,

$$y = \frac{N}{2\sqrt{3}\,\sigma},\ \text{the rectangle.}$$

* Loc. cit., p. 369.
† Loc. cit., pp. 368-9. Deduced at once from $m'^2-rm'+\epsilon = 0$ by putting $m' = m+1$.
‡ The sign of $\sqrt{\beta_1}$ in $\sigma(\sqrt{3+\beta_1}\pm\sqrt{\beta_1})$ must be determined from that of $\mu_3$.
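The R-line J-curve depends only on $N$, $\sigma$ and $\beta_1$, so its ordinate is easy to tabulate. The sketch below is a hedged illustration: it takes the mean as origin and the positive sign of $\sqrt{\beta_1}$ (which the memoir says must really be settled by the sign of $\mu_3$), treats $\beta_1 = 0$ separately as the rectangle, and uses the hypothetical name `type_xii_ordinate`.

```python
from math import pi, sin, sqrt


def type_xii_ordinate(x, n_total, sigma, beta1):
    """Ordinate of the Type XII J-curve, mean as origin, positive sqrt(beta1) assumed."""
    if beta1 == 0:
        half_range = sqrt(3) * sigma          # rectangle of range 2*sqrt(3)*sigma
        return n_total / (2 * half_range) if -half_range <= x <= half_range else 0.0
    s3, s1 = sqrt(3 + beta1), sqrt(beta1)
    m = s1 / s3                               # exponent sqrt(beta1 / (3 + beta1))
    lower = -sigma * (s3 + s1)                # zero-ordinate terminal
    upper = sigma * (s3 - s1)                 # infinite-ordinate terminal
    if not (lower < x < upper):
        return 0.0
    y0 = n_total * sin(pi * m) / (2 * pi * sigma * s1)
    return y0 * ((x - lower) / (upper - x)) ** m
```

For very small `beta1` the returned ordinate approaches $N/(2\sqrt{3}\,\sigma)$ across the range, which is the rectangle limit stated above.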

Now consider what happens for any frequency curve of the limiting character when both $\beta_1$ and $\beta_2$ become infinite, say, in the ratio $\beta_2 = p\beta_1$. Then

$$r = \frac{6(p-1)}{3-2p},$$

and accordingly $r$ will be finite if $p$ is finite, except along the Type III. line. Accordingly for $\beta_1 = \infty$, $\epsilon$ will be zero. Thus the ratio of $\beta_2$ to $\beta_1$ is from their values,

$$\frac{\beta_2}{\beta_1} = \frac{3(r+2)}{2(r+3)},$$

which agrees with the above result for $r$.

For the special case when $r = 2$, we have $p = \tfrac65$, which agrees with the limiting ratio of $\beta_2/\beta_1$ along the R-line.

Now when $\epsilon = 0$ we have from

$$m^2-(r-2)m+\epsilon-r+1 = 0,$$

$$m = \tfrac12(r-2\pm r) = r-1 \ \text{or}\ -1.$$



Thus from the equations on page 445,

$$y = \frac{N}{b}\,\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}\left(\frac{m_1+1}{m_1+m_2+2}+\frac{x}{b}\right)^{m_1}\left(\frac{m_2+1}{m_1+m_2+2}-\frac{x}{b}\right)^{m_2}$$

$$= \frac{N(m_2+1)}{b}\,\frac{1}{\Gamma(m_2+2)}\left(1+\frac{x}{b}\right)^{r-1}\left(-\frac{x}{b}\right)^{-1}$$

$$= \frac{N(m_2+1)}{x}\left(1-\frac{x}{b}\right)^{r-1},$$

if we change the sense of the axis of $x$ and take $x$ from 0 to $+b$.

Now in order that $\sigma$ should be finite it is needful that $b$ should be infinite when $m_2 = -1$, for

$$\sigma^2 = b^2(m_2+1)/\{r(r+1)\}.$$

But if $b$ be infinite, $y = 0$ owing to the factor $m_2+1$, for every value of $x$, except $x = 0$. Hence the frequency is a concentrated lump at $x = 0$, and this involves of itself $\sigma = 0$.


But if $\sigma = 0$, $b$ must be finite or zero, and these both again throw us back on a concentrated frequency at $x = 0$.

Accordingly, when $\beta_1$ and $\beta_2$ both become infinite, we deal with a concentrated frequency lump. But the ratio of $\beta_1$ to $\beta_2$ will depend on the manner in which we have reached this limiting case.

For example, if we are dealing with the correlations in samples of two drawn from a population in which the correlation is $\rho$, the frequency consists of two lumps, but as $\rho$ approaches unity, one lump shrivels up, $\beta_1$ and $\beta_2$ both become infinite, but their ratio is one of equality, i.e., we approach infinity along the line $\beta_2-\beta_1-1 = 0$.

When we take samples of three from a population of correlation $\rho$, the frequency curves are U-shaped, but as $\rho$ approaches unity the frequency concentrates in one leg of the U, $\beta_1$ and $\beta_2$ both become indefinitely larger, but their ultimate ratio $\beta_2/\beta_1$
appears to equal f,"^ The U-curve flattens down into an L-curve, of which the 
horizontal limb extends to infinity and becomes indefinitely thin, while the vertical 
limb contains all the frequency. 

* I use the word "appears" advisedly, because the ratio has been obtained by determining the value of $\beta_2/\beta_1$ for high numerical value of $\rho$. The actual ratio for $\rho = 1$ depends upon approaching a limit in rather complicated elliptic integral expressions, which I have not yet accomplished.

(9) Scheme of Skew Frequency Curves Represented as a Diagram. We are now able to considerably enlarge our diagrammatic representation of frequency curves. (See Diagram, Plate 1.)

Every distribution is represented by its characteristic co-ordinates $\beta_1$ and $\beta_2$, which must be positive, and therefore we need only deal with the positive $\beta_1$, $\beta_2$ quadrant. No frequency distribution at all can lie above the line $\beta_2-\beta_1-1 = 0$; this restriction removes more than half the positive quadrant. No frequency distribution can be adequately represented by one of the present system of skew curves, if it falls below the line $8\beta_2-15\beta_1-36 = 0$. The area below this line is therefore termed heterotypic. Heterotypic distributions are to say the least of it very rare, if they be not extremely improbable. We have seen that there is some reason to suppose that bimodal distributions would give rise to such heterotypic distributions, but with our present views as to frequency such distributions when they do not arise from the mere anomalies of random sampling are classed as heterogeneous, and supposed to be due to mixtures.

Having thus limited our area at top and bottom we proceed to consider the various possibilities that arise.

The $\beta_2$-axis, where $\beta_1 = 0$, is the axis of symmetrical frequency distributions. Possibilities begin at the B-line or the point $\beta_2 = 1$. Here we have two equal concentrated frequency blocks at any arbitrary distance $b$. This is the case of two alternative values, either of which is equally probable. For example, heads or tails in the repeated tossings of a single coin, or positive or negative perfect correlation in samples of two taken from a population of individuals bearing two uncorrelated characters. Below the point $\beta_2 = 1$, descending the $\beta_2$-axis, the two concentrated frequencies expand into a symmetrical U-curve. This is Type II$_U$ with the equation

$$y = y_0\,(1-x^2/a^2)^{-m},$$

and the criterion $\beta_1 = 0$, $\beta_2 < 1.8$,

$$m = \tfrac12(9-5\beta_2)/(3-\beta_2),$$

and

$$y_0 = \frac{N}{\sqrt{2\pi}\,\sigma}\,\frac{\Gamma(\tfrac32-m)}{\Gamma(1-m)\,\sqrt{\tfrac32-m}}.$$

When $\beta_2 = 1.8$, $m = 0$, and we reach the "rectangle-point" R. Here $y_0 = N/(2a)$ and $\sigma = a/\sqrt{3}$.

Samples of three individuals from a population whose individuals carry two uncorrelated characters give a symmetrical U-frequency for the coefficients of correlation of those characters in triplets of individuals. In this illustration $\beta_2 = 1.5$. Samples of four individuals from the same population give a rectangle for the frequency distribution of the coefficients of correlation. Passing still lower down the axis of symmetrical frequency the type is now Type II$_L$, or the limited range frequency curve*

$$y = y_0\,(1-x^2/a^2)^{m},$$

and the criterion is $\beta_1 = 0$, $\beta_2 > 1.8 < 3$. In this range $m$ increases from 0 to $\infty$, and

$$m = \tfrac12(5\beta_2-9)/(3-\beta_2),$$

$$y_0 = \frac{N}{\sqrt{2\pi}\,\sigma}\,\frac{\Gamma(\tfrac32+m)}{\Gamma(1+m)\,\sqrt{\tfrac32+m}}.$$

We see that the range grows greater as $m$ approaches infinity, or $\beta_2 = 3$, when we reach G the Gaussian point ($\beta_1 = 0$, $\beta_2 = 3$).

If samples of $n$ individuals be taken from an indefinitely large population in which the individuals carry two uncorrelated characters, then if $n$ be 5 or over, all the frequency curves of the correlation coefficients of these samples are of Type II$_L$, only approaching the Gaussian when $n$ is very considerable indeed. For example when $n = 25$, $\beta_2 = 2.7692$, and the frequency is still a good way from the Gaussian. When $n = 400$, $\beta_2 = 2.9850$, it is thus fairly close to it, but is not coincident.
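The numerical values of $\beta_2$ quoted for uncorrelated characters can be reproduced from a closed form. The sketch below rests on an assumption not stated in the memoir at this point: that for zero population correlation and normal characters the frequency of the sample coefficient $r$ is proportional to $(1-r^2)^{(n-4)/2}$, a symmetrical curve of Type II. with exponent $(n-4)/2$, for which $\beta_2 = 3(n-1)/(n+1)$. Both function names are hypothetical.

```python
def type_ii_exponent(n: int) -> float:
    """Exponent m of (1 - r**2)**m for samples of n with zero population correlation (assumed form)."""
    if n < 2:
        raise ValueError("a correlation coefficient needs at least two individuals")
    return (n - 4) / 2


def beta2_uncorrelated(n: int) -> float:
    """beta2 of the sampling distribution of r under the same assumption."""
    if n < 2:
        raise ValueError("a correlation coefficient needs at least two individuals")
    return 3 * (n - 1) / (n + 1)
```

Under this assumption $n = 2$ falls on the B-line, $n = 3$ in the U-area, $n = 4$ at the rectangle point, and every larger $n$ in the limited range area short of the Gaussian point.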

* It is, perhaps, worth noticing that for $\beta_2 = 15/7$ we obtain the ordinary parabola as a special type of frequency-curve.

After we have passed the Gaussian point we obtain curves of unlimited range of Type VII., of which the equation is

$$y = y_0\,(1+x^2/a^2)^{-m}.$$

The range of $\beta_2$ is from 3 to $\infty$ and $m = \tfrac12(5\beta_2-9)/(\beta_2-3)$ falls from infinity to 2.5; while

$$a^2 = \sigma^2\cdot 2\beta_2/(\beta_2-3),$$

$$y_0 = \frac{N}{\sqrt{2\pi}\,\sigma}\,\frac{\Gamma(m)}{\Gamma(m-\tfrac12)\,\sqrt{m-\tfrac32}}.$$

Illustrations of curves of Type VII.* are not infrequent in biological statistics. We see that the Gaussian is a mere point in an infinite range of symmetrical frequency curves, and a single point in a doubly infinite series of general frequency distributions.
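The progression down the axis of symmetry can be summarised as a small lookup. This is a sketch only: the labels and the tolerance `tol` are illustrative choices, the name `classify_symmetric` is hypothetical, and the Type VII. exponent is taken as $\tfrac12(5\beta_2-9)/(\beta_2-3)$, an assumption chosen to agree with the statement that $m$ falls from infinity to 2.5.

```python
def classify_symmetric(beta2: float, tol: float = 1e-9):
    """Return (type label, exponent m or None) for a symmetrical curve, beta1 = 0."""
    if beta2 < 1 - tol:
        raise ValueError("impossible: beta2 cannot be below 1 when beta1 = 0")
    if abs(beta2 - 1) <= tol:
        return "two equal concentrated blocks", None
    if abs(beta2 - 1.8) <= tol:
        return "rectangle", 0.0
    if abs(beta2 - 3) <= tol:
        return "Gaussian", None
    if beta2 < 1.8:
        # U-shaped: y = y0 (1 - x^2/a^2)^(-m)
        return "Type II_U", 0.5 * (9 - 5 * beta2) / (3 - beta2)
    if beta2 < 3:
        # Limited range: y = y0 (1 - x^2/a^2)^m
        return "Type II_L", 0.5 * (5 * beta2 - 9) / (3 - beta2)
    # Unlimited range: y = y0 (1 + x^2/a^2)^(-m)
    return "Type VII", 0.5 * (5 * beta2 - 9) / (beta2 - 3)
```

A call such as `classify_symmetric(15 / 7)` picks out the parabola mentioned in the footnote as the member of Type II$_L$ with exponent unity.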

Now let us consider the asymmetrical frequency curves displayed on the Diagram. If we approach from the "impossible area" we reach on the B-line the first available type of frequency, the alternative concentrated blocks. At one end of the B-line we have two equal isolated frequencies, and at the other a single isolated frequency.

Crossing the B-line we reach the area of limited range U-shaped curves, i.e., Type I$_U$, which has for its equation:

$$y = y_0\,(1+x/a_1)^{-m_1}(1-x/a_2)^{-m_2}.$$

This U-area extends as far as the upper branch of the loop of the biquadratic, the asymptote of which, $24\beta_2-27\beta_1-38 = 0$, is indicated by a broken line. In U-shaped frequency curves both $m_1$ and $m_2$ are necessarily less than unity, for their product is $\epsilon-r+1$, which is less than unity and positive above the upper branch of the biquadratic (i.e., $\epsilon-r+1 = 0$). Type I$_U$ is fitted as Type I. (see 'Phil. Trans.,' A, vol. 186, p. 367), and has been illustrated by me ('Roy. Soc. Proc.,' vol. 62, p. 287), by fitting curves of frequency to cloudiness. The frequency curves for the correlation coefficients of samples of three drawn from a population whose individuals have two characters of any degree of correlation are also skew U-shaped frequency curves, although their algebraic form has not the above simplicity.

* Type II$_L$ was discussed in my first memoir, 'Phil. Trans.,' vol. 186, p. 372. Type II$_U$ and Type VII. are briefly referred to in 'Biometrika,' vol. IV., p. 174, but, unfortunately, with some rather disturbing misprints. They are correctly placed on Rhind's diagram, 'Biometrika,' vol. VII., p. 131, but the formulae for fitting are not given. The formulae have been given for many years in lecture-notes, and the curves have been frequently used.

On the upper branch of the biquadratic loop we reach curves of Type VIII., i.e.,

$$y = y_0\,(1+x/a)^{-m},$$

discussed on p. 444 of the present memoir. Here $m$ is less than unity.

We now pass into the loop of the biquadratic between the upper branch and the R-line. Here we have J-curves, Type I$_J$, of the form

$$y = y_0\,(1+x/a_1)^{m_1}(1-x/a_2)^{-m_2},$$

where $m_2$ is less than unity, and $m_1$ is less than $m_2$.

Coming to the R-line, $m_1$ becomes equal to $m_2$ and we have Type XII., or

$$y = y_0\left(\frac{1+x/a_1}{1-x/a_2}\right)^{m},$$

discussed on p. 446 of the present memoir. Below the R-line, we return to Type I$_J$, but $m_1$ is now greater than $m_2$.*

We now reach the lower branch of the biquadratic loop. This is divided into three portions by three critical points. The first portion is from the rectangle-point (R) to the line-point L. In this portion we start from R with the curve of Type IX$_1$, or,

$$y = y_0\,(1+x/a)^{m}$$

for $m = 0$, or the rectangle, and proceed from that value to $m = 1$, which gives us the line (or triangle); the range is $-a$ to 0. Since $m$ is always $< 1$, the curve rises perpendicularly at $x = -a$, and approximates to a trapezoidal form. The method of fitting is discussed in this memoir, p. 441. The fitting of the line curve

$$y = y_0\,(1+x/a)$$

is dealt with on p. 442.

Beyond the line-point L we have Type IX$_2$ which differs in no way from Type IX$_1$, except that $m$ is now greater than unity, and there is contact of a rapidly increasing order at $x = -a$.

When $m = \infty$ we find Type X. the exponential curve, at the exponential point E. The fitting of this curve

$$y = \frac{N}{\sigma}\,e^{\pm x/\sigma}$$

has been discussed on p. 443.

* For example, at the point $\beta_2 = 4$, $\beta_1 = 2$, between the R-line and upper branch,

$$y = y_0\,(1+x/a_1)^{0.2123}/(1-x/a_2)^{0.7123},$$

but at $\beta_2 = 8$, $\beta_1 = 4$, between the R-line and the lower branch,

$$y = y_0\,(1+x/a_1)^{7.4011}/(1-x/a_2)^{0.4011}.$$

Since E is the junction of several types, we turn to consider Type III. which is the curve found along the critical line

$$2\beta_2-3\beta_1-6 = 0.$$

It passes through the Gaussian point G, and its equation is

$$y = y_0\,(1+x/a)^{p}\,e^{-\gamma x}, \quad\text{with } p = \gamma a.$$

It is fully discussed in my first memoir; see 'Phil. Trans.,' A, vol. 186, p. 373, et seq.

From G to the exponential point E, $p$ ranges from $\infty$ to zero, which latter value provides the exponential curve. After the exponential point $p$ becomes negative and we reach Type III$_J$, a J-curve with range limited in one direction only. This curve separates the doubly limited curves of Type I$_J$ from curves of Type VI$_J$, which lie below the line $2\beta_2-3\beta_1-6 = 0$, and above the lower branch of the biquadratic loop. On this lower branch of the loop we have Type XI., or the form

$$y = y_0\,x^{-m},$$

the range being from an arbitrary value $b$ to $\infty$, and $m$ ranging from $\infty$ to 5. This type is fully discussed in the present memoir; see p. 444. It continues right away along this branch of the biquadratic, but at $\beta_2 = 22.725$ and $\beta_1 = 9.72$, the eighth moment of the theoretical curve would become infinite, and accordingly the probable error of the fourth moment coefficient would become theoretically infinite. Thus since the fitting of the curve depends on the fourth moment its constants would cease to be reliable measures of the distribution. We enter at this point the "heterotypic area," for this type of curve.* We have now two further areas to clear off, namely those between the Type III. line and the lower branch of the biquadratic loop. Above the former and below the latter we have the range of double limited frequency curves, i.e., Type I$_L$, or

$$y = y_0\,(1+x/a_1)^{m_1}(1-x/a_2)^{m_2}.$$

This curve was fully discussed in my first memoir ('Phil. Trans.,' A, vol. 186, p. 376, et seq.). $m_1$ and $m_2$ are both positive, and experience has shown that probably the bulk of all frequency distributions cluster into this area.

Above the biquadratic loop and below the line $2\beta_2-3\beta_1-6 = 0$, we have curves of Type VI$_J$, or

$$y = y_0\,x^{-q_1}(x-a)^{-q_2},$$

with range from $x = a$ to $x = \infty$.

They have been considered on p. 443 of the present memoir. Their full theory is precisely that of curves of Type VI. in general, discussed in the first supplement to my memoir on skew variation ('Phil. Trans.,' A, vol. 197, p. 448, et seq.). The only point to be emphasised is that the $q_2$ of Equation XIX. of that memoir in this area is negative and less than unity. The treatment is identical.

* Of course, by using the actual eighth moment of the data, instead of the eighth moment of the theoretical curve, the standard deviation of the fourth moment would be finite, but this procedure would really indicate that, as far as the high moments are concerned, curve and data were discordant, and that we should not really be finding the probable error of a constant of our theoretical frequency curve.

Below both the Type III. line and the biquadratic, we have a space bounded by the cubic

$$4(4\beta_2-3\beta_1)(2\beta_2-3\beta_1-6) = \beta_1(\beta_2+3)^2.$$

This is the area of Type VI. proper, i.e.,

$$y = y_0\,(x-a)^{q_2}/x^{q_1},$$

with range from $x = a$ to $x = \infty$, $q_2 < q_1$ being positive, and is fully discussed in the memoir just cited.

The area of Type VI. is limited by the above cubic along which Type V., or,

$$y = y_0\,x^{-p}\,e^{-\gamma/x},$$

from $x = 0$ to $x = \infty$, describes the frequency. Its full consideration will be found in 'Phil. Trans.,' A, vol. 197, p. 446, et seq. Below the Type V. cubic we reach the area of Type IV. curve, or

$$y = y_0\,e^{-\nu\tan^{-1}(x/a)}/\{1+(x/a)^2\}^{m}.$$

This has unlimited range in both directions and its treatment is fully discussed in my first memoir ('Phil. Trans.,' A, vol. 186, p. 376, et seq.). Theoretically, Types IV. and VI. describe all types lying below the line $2\beta_2-3\beta_1-6 = 0$. The objection to their use lies in the increasing probable errors of their constants, however good their general fit may be. To warn the statisticians of this, the line $8\beta_2-15\beta_1-36 = 0$ is drawn on the diagram and the area below it is marked "heterotypic area." I use this term to signify that it is doubtful whether my skew-frequency curves, depending only on the first four moments, can adequately describe distributions of types falling below this line; they require the use of the fifth and higher moment coefficients. Their occurrence in practice, however, must be rare.
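The selection of an area of the diagram from a given pair $\beta_1$, $\beta_2$ can be sketched as follows. This is a simplified illustration, not the procedure of the memoir: it distinguishes only the areas (Types I$_U$, I$_J$, I$_L$, IV., VI., VI$_J$) and the Type III. line, uses the signs of the roots of $m^2-(r-2)m+\epsilon-r+1 = 0$ above that line and the cubic ratio with the sign of $\epsilon-r+1$ below it, and makes no attempt to detect points lying exactly on the other critical curves. The names `r_and_epsilon` and `classify_area` are hypothetical.

```python
from math import sqrt


def r_and_epsilon(beta1, beta2):
    """The constants r and epsilon of the memoir, from beta1 and beta2."""
    r = 6 * (beta2 - beta1 - 1) / (3 * beta1 - 2 * beta2 + 6)
    eps = r**2 / (4 + 0.25 * beta1 * (r + 2) ** 2 / (r + 1))
    return r, eps


def classify_area(beta1, beta2):
    """Return (area label, heterotypic flag) for a point of the beta1, beta2 plane."""
    if beta1 < 0 or beta2 - beta1 - 1 < 0:
        return "impossible", False
    heterotypic = 8 * beta2 - 15 * beta1 - 36 > 0
    line3 = 2 * beta2 - 3 * beta1 - 6
    if line3 == 0:
        return "Type III line", heterotypic
    r, eps = r_and_epsilon(beta1, beta2)
    if line3 < 0:
        # Above the Type III line: Type I, subdivided by the signs of m1 and m2.
        disc = max((r - 2) ** 2 - 4 * (eps - r + 1), 0.0)
        m_small = ((r - 2) - sqrt(disc)) / 2
        m_large = ((r - 2) + sqrt(disc)) / 2
        if m_large < 0:
            return "Type I_U", heterotypic
        if m_small < 0:
            return "Type I_J", heterotypic
        return "Type I_L", heterotypic
    # Below the Type III line: compare the two sides of the Type V cubic.
    kappa = beta1 * (beta2 + 3) ** 2 / (4 * (4 * beta2 - 3 * beta1) * line3)
    if kappa < 1:
        return "Type IV", heterotypic
    # eps - r + 1 > 0 marks the inside of the biquadratic loop (J-shaped Type VI).
    return ("Type VI_J" if eps - r + 1 > 0 else "Type VI"), heterotypic
```

A symmetrical point below G is reported as Type IV. by this sketch, whereas the diagram assigns the axis itself to Type VII.; a fuller version would test each critical line within the probable errors of $\beta_1$ and $\beta_2$.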

It will be noticed that the line $\beta_2-\beta_1-3 = 0$ is drawn through the Gaussian point. This is the relation which must be satisfied in the case of Poisson's exponential limit to the binomial. Hence, in the case of a distribution with $\beta_1$, $\beta_2$ near this line, it is worth while investigating whether the "law of small numbers" is appropriate. Above this line every real binomial distribution, i.e., cases of $p$ and $q$ both positive and less than unity, and $n$ positive (taking the binomial as $(p+q)^n$) must lie, for

$$\frac{\beta_2-3}{\beta_1} = \frac{1-6pq}{1-4pq},$$

and the right-hand side is clearly less than unity. This limited area covered by the real binomial explains its relative infrequency as a descriptive series in practical statistics. If, however, we take the negative binomial as admissible, i.e., allow forms of the type

$$(p-q)^{-n}, \quad\text{where } p-q = 1,$$

we extend the possible area of a binomial down to the line $2\beta_2-3\beta_1-6 = 0$.

Such a type of binomial is by no means of infrequent occurrence and can be more or less justified on a priori grounds.* Below Type III. line, the values of $p$ and $q$ become in the mathematical sense unreal, i.e., imaginary. It is by no means certain, however, that such imaginary binomials with real moment coefficients may not, like imaginary hypergeometricals, give statistically good fits and be ultimately provided with physical interpretations.
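The position of the binomial and negative binomial relative to the two lines can be verified numerically. The sketch below uses the standard moment formulae in the modern parametrisation, which is an assumption relative to the memoir's $(p+q)^n$ and $(p-q)^{-n}$ notation: for the negative binomial, `k` is the index and `prob` the chance of success, so that the memoir's $p$ and $q$ correspond to `1/prob` and `(1 - prob)/prob`. Both function names are hypothetical.

```python
def binomial_betas(n: float, p: float):
    """beta1, beta2 of the binomial (p + q)^n with q = 1 - p."""
    q = 1 - p
    npq = n * p * q
    return (q - p) ** 2 / npq, 3 + (1 - 6 * p * q) / npq


def negative_binomial_betas(k: float, prob: float):
    """beta1, beta2 of the negative binomial with index k and success chance prob."""
    beta1 = (2 - prob) ** 2 / (k * (1 - prob))
    beta2 = 3 + 6 / k + prob**2 / (k * (1 - prob))
    return beta1, beta2
```

With these formulae $\beta_2-\beta_1-3$ equals $-2/n$ for the binomial and $+2/k$ for the negative binomial, so the former lies above the Poisson line and the latter below it, while $2\beta_2-3\beta_1-6$ stays negative for the negative binomial, keeping it above the Type III. line.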

(10) Concluding Remarks. It is very difficult to assert finality for any scientific investigation, but I trust this second supplement to my original memoir on skew variation of 1894 has garnered the last harvest of possible types within the limits proposed in that investigation. The object was the discovery of a system of frequency curves providing for every possible variation of the first four moment coefficients of a distribution and provision for their rapid treatment and calculation. Since 1894 much has been done by the provision of tables of the new functions and improved tables of old functions necessary to carry this out.† Diagrams like that accompanying this memoir enable the statistician who has calculated the characteristic $\beta_1$ and $\beta_2$ to select at once the appropriate type, from the position of the point $\beta_1$, $\beta_2$ in the $\beta_1$, $\beta_2$ plane. The first diagram, prepared by Mr. A. J. Rhind at my suggestion, has been long in use.‡ For the present very carefully prepared and much extended diagram I have to thank my colleague, Miss Adelaide G. Davin, whose labours cannot fail to be appreciated by those having to handle practically statistical data.

* See 'Biometrika,' vol. IV., p. 209, and vol. XI., p. 139.
† Now collected in "Tables for Statisticians and Biometricians," issued by the Cambridge University Press.
‡ 'Biometrika,' vol. VII., p. 131.

Since the publication of my original memoir on skew variation, many attempts have been made to express the nature of skew distributions by other systems of curves or by expansions in series. I have given careful attention to these competing systems and have discussed some of them elsewhere ('Biometrika,' vol. IV., pp. 169 to 212). My chief objections to them arise from the fact that they either (i.) cover far less than the necessary area; or (ii.) involve constants the probable errors of which can be indefinitely great; or (iii.) involve constants the probable errors of which have not been or possibly cannot be calculated. In no case that I know of have they systematically been applied to extensive ranges of data, and the goodness of fit compared with that of other systems. The existence of such competing systems is at any rate noteworthy evidence that to attempt to describe frequency by the Gaussian curve is hopelessly inadequate. It is strange how long it takes to uproot a prejudice of that character! If the reader will turn again to the present diagram, he will see that the Gaussian frequency occupies a single point in an indefinitely extended area. Those who support the Gaussian theory have to prove that no distribution occurs at a distance from the point G of our diagram greater than could be accounted for by the probable errors of sampling of $\beta_1$ and $\beta_2$. These errors are known and have been tabled,* and that position is quite untenable. Frequency distributions occur every day which by no manner of means can be described by Gaussian systems.

It has been said that my skew curves suddenly change their algebraic type and that the statistician is puzzled by a slight change in the constants $\beta_1$ and $\beta_2$ involving such radical changes in the equation to the type. But if the reader examines the present diagram, he will see that the main Types I$_U$, I$_J$, I$_L$, IV., VI. and VI$_J$ occur in areas, while the remaining types occur in the critical curved or straight lines which bound these areas. Special cases like the Gaussian, the exponential or the rectangular distributions occur where critical lines intersect. Now all these critical lines are really critical in the sense that a change of important physical significance occurs in this neighbourhood, and it is very unlikely that physical changes will be unaccompanied by sharp algebraical changes of form, such as are directly obvious in my curves, but are disguised by discontinuities in some of the proposed alternative expressions in series.†

Any one illustration that the frequencies which occur in actual statistical data can practically cover the whole possible area of the $\beta_1$, $\beta_2$ plane, and can present frequency distributions which change abruptly in type, will suffice to confute both the argument that frequency is concentrated in or near the Gaussian point, and the argument that it is undesirable that skew-frequency curves should be so manifold in form, although how they are to change from U to J, to "cocked hat," to rectangle and to exponential forms without this abrupt change will be a puzzling problem to solve for the professed mathematician. An illustration of this character has been several times referred to in the course of this paper. Let us suppose there exists an indefinitely large population, each individual of which carries any number of characteristics which are correlated together, for simplicity we will say according to the normal law. We may suppose that there are enough pairs of characters to give all values of the correlation $\rho$ from $+1$ to $-1$.

* 'Tables for Statisticians and Biometricians,' pp. 68-71.

† An analogy might be given in the case of the expression of a "cocked-hat" shape of finite range and a U-shaped distribution by a single Fourier's series. Here the trigonometrical expression by the Fourier's series would be superficially the same if kept in symbolic form, while the algebraic form of the U-curve would require two vertical asymptotes and its equation would be wholly different from that of the "cocked-hat" form. The Fourier expression would only disguise the real discontinuity. In the same manner real discontinuity of form is disguised in the series which express skew frequency in terms of a long series of moment coefficients.

Now from this population we will take a large number $m$ of samples of $n$ individuals. If in each one of these samples we calculate the correlation, $r$, between two variates, then $r$ will not be equal to the value of $\rho$ in the sampled population, but the $m$ samples will give a frequency curve for $r$, which is limited in range between $+1$ and $-1$ and is determined by $n$ the number of individuals in the sample and by $\rho$ the correlation of the characters in the indefinitely large population sampled. We thus obtain a doubly infinite series of frequency distributions. The general theory of such distributions has been worked out by "Student" ('Biometrika,' vol. VI., p. 302, et seq.), Mr. H. E. Soper (ibid., vol. IX., p. 91, et seq.), and Mr. R. A. Fisher (ibid., vol. X., p. 507, et seq.). The actual forms of the frequency curves are not usually expressible by simple single functions, but the ordinates and the $\beta_1$, $\beta_2$ admit of numerical determination. The calculations are extremely laborious, but up to the present the members of my laboratory staff have calculated some 270 frequency curves with nearly 40 ordinates each for values of $\rho$ ranging from 0 to 1, and of $n$ from 2 to 400. The great bulk of these curves show no approach to normality. The values of $\beta_1$, $\beta_2$ range from points on the B-line down to infinity, the distributions contain concentrated blocks, U-shaped curves, J-shaped curves, rectangles, trapezoid-like forms and every variety of skewness in doubly limited range curves. Only in cases where $n$ is very considerable and $\rho$ is neither a positive nor a negative high correlation is there an approximation to the Gaussian. For a series of curves in which $\beta_1$ can be 5 and $\beta_2 = 9$, or both, if we will, ten times these amounts, it is idle to talk about the value of the Gaussian curve ($\beta_1 = 0$, $\beta_2 = 3$) in describing variation. These frequency curves can be actually obtained by experimental sampling, although the process is laborious, and indeed were so obtained in the first place.* They arise from observation and experiment. The remarkable point about them is that they illustrate all the types we have been discussing and justify sharp transitions in algebraic forms by showing that such transitions correspond to actual physical facts arising from experimental statistical data. The whole illustration, details of which will shortly be published, indicates the evil of implicit reliance on a classical theory.
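The experimental sampling described above can be imitated by simulation. The following is a rough sketch under assumptions of its own, not a reconstruction of the laboratory's procedure: pairs are drawn from a normal population with correlation `rho`, the product-moment coefficient is computed for each sample of `n`, and $\beta_1$, $\beta_2$ are estimated from the simulated coefficients. The function names, the default of 20000 samples and the fixed seed are all illustrative choices.

```python
import random
from math import sqrt


def sample_r(n: int, rho: float, rng: random.Random) -> float:
    """Correlation coefficient of one sample of n pairs from a normal population."""
    xs, ys = [], []
    for _ in range(n):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(u)
        ys.append(rho * u + sqrt(1 - rho * rho) * v)  # pair with correlation rho
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def simulate_betas(n: int, rho: float, samples: int = 20000, seed: int = 1):
    """Estimate beta1 and beta2 of the frequency of r by repeated sampling."""
    rng = random.Random(seed)
    rs = [sample_r(n, rho, rng) for _ in range(samples)]
    mean = sum(rs) / samples
    mu2, mu3, mu4 = (sum((r - mean) ** k for r in rs) / samples for k in (2, 3, 4))
    return mu3**2 / mu2**3, mu4 / mu2**2
```

Running `simulate_betas(4, 0.0)` gives a $\beta_2$ close to the rectangle value, and raising `rho` towards unity with small `n` drives both estimates far from the Gaussian point, which is the behaviour the paragraph above describes.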

The Gaussian theory of error has, with great weight of authority, been applied to determine significant differences in statistical constants. The theory of the "probable error" must be justified in the case of each statistical constant to which it is applied. Psychologists have been busy discussing the differences found in mental correlations deduced from small samples on the basis of significance judged by the Gaussian theory of probable error. That theory has practically no application, as the "probable error" has really no meaning in the case of the bulk of the samples dealt with. Applications of the theory of probable error in other sciences than psychology to experimental results based on small samples will readily occur to the reader. The conclusions may be correct or incorrect, but they are unquestionably based on an inflation of the Gaussian point, G, to cover all that may be happening in the whole area of possible $\beta_1$, $\beta_2$ points in our diagram. It cannot at present be too often emphasised that such inflation is illegitimate, and that, as Dr. Isserlis has recently indicated,† the assumption that the distribution curves of statistical constants follow the Gaussian curve is not legitimate, especially in the case of "small samples," which not only for many commercial purposes, e.g., experimental brewing, but in numerous branches of science, e.g., psychology, astronomy, and even physics, are all that economy of money or time permits of being recorded.

* 'Biometrika,' vol. VI., pp. 305-7.
† 'Roy. Soc. Proc.,' A, vol. 92, p. 23.